ABSTRACT

The objective of this paper is to review contemporary video segmentation and tracking methods, organize them into categories, and identify new trends. Object detection and tracking is, in general, a challenging problem. Detecting moving objects is important in many tasks, such as video surveillance and moving object tracking. The design of a video surveillance system is directed at the automatic identification of events of interest, particularly the tracking and classification of moving objects. Difficulties in tracking objects can arise from abrupt object motion, changing appearance patterns of the object and the scene, non-rigid object structures, object-to-object and object-to-scene occlusions, and camera motion. Tracking is typically performed in the context of higher-level applications that need the location and/or shape of the object in every frame. Typically, assumptions are made to constrain the tracking problem in the context of a particular application. In this review, we classify tracking approaches on the basis of the object and motion representations used, and provide detailed descriptions of representative methods in each class. Furthermore, we discuss the important issues related to tracking, including the use of suitable image features, the selection of motion representations, and the detection of objects.

KEYWORDS: Object Detection, Video Segmentation, Video Surveillance, Object Tracking Survey, Review

1. INTRODUCTION 

Object tracking is an important task within the field of computer vision. The proliferation of high-powered computers, the availability of high-quality and low-cost video cameras, and the increasing need for automated video analysis have generated a great deal of interest in object tracking algorithms. There are three key stages in video analysis: the detection of interesting moving objects, the tracking of such objects from frame to frame, and the analysis of the object tracks to recognize their behavior. Consequently, object tracking is relevant in the tasks of:

• Motion-based recognition, that is, human identification based on gait, automatic object detection, etc.;

• Automated surveillance, that is, monitoring a scene to detect suspicious activities or events;

• Video indexing, that is, the automatic annotation and retrieval of videos in multimedia databases;

• Human-computer interaction, that is, gesture recognition, eye gaze tracking for data input to computers, etc.;

• Traffic monitoring, that is, real-time gathering of traffic statistics to direct traffic flow;

• Vehicle navigation, that is, video-based path planning and obstacle avoidance capabilities.

In its simplest form, tracking can be defined as the problem of estimating the trajectory of an object in the image plane as it moves around a scene. In other words, a tracker assigns consistent labels to the tracked objects in different frames of a video. Additionally, depending on the tracking domain, a tracker can also provide object-centric information, such as the orientation, area, or shape of an object. Tracking objects can be complex due to:

• Loss of information caused by the projection of the 3D world onto a 2D image,

• Noise in images,

• Complex object motion,

• Non-rigid or articulated nature of objects,

• Partial and full object occlusions,

• Complex object shapes,

• Scene illumination changes, and

• Real-time processing requirements.

One can simplify tracking by imposing constraints on the motion and/or appearance of objects. For example, almost all tracking algorithms assume that the object motion is smooth with no abrupt changes. One can further constrain the object motion to be of constant velocity or constant acceleration based on a priori information. Prior knowledge about the number and size of objects, or the object appearance and shape, can also be used to simplify the problem.
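
As an illustration of how a motion constraint narrows the search, the following minimal Python sketch extrapolates the next position under a constant-velocity assumption and rejects detections that fall far from the prediction. The function names, the gating radius `max_dist`, and the time step `dt` are illustrative assumptions and are not taken from any of the surveyed methods.

```python
import math

def predict_constant_velocity(track, dt=1.0):
    """Extrapolate the next (x, y) position from the last two observations."""
    (x0, y0), (x1, y1) = track[-2], track[-1]
    vx, vy = (x1 - x0) / dt, (y1 - y0) / dt  # velocity estimated from one step
    return (x1 + vx * dt, y1 + vy * dt)

def is_plausible(track, candidate, max_dist=5.0, dt=1.0):
    """Accept a detection only if it lies near the constant-velocity prediction."""
    predicted = predict_constant_velocity(track, dt)
    return math.dist(predicted, candidate) <= max_dist

track = [(0.0, 0.0), (2.0, 1.0)]
print(predict_constant_velocity(track))   # (4.0, 2.0)
print(is_plausible(track, (5.0, 2.0)))    # True
print(is_plausible(track, (20.0, 20.0)))  # False
```

A practical tracker would usually replace the two-point velocity estimate with a filter that also models measurement noise; the sketch only shows the gating idea.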

A large number of tracking approaches have been proposed that attempt to address these difficulties for a variety of scenarios. The goal of this survey is to group tracking approaches into broad categories and provide comprehensive descriptions of the representative approaches in each category. We wish to give readers who need a tracker for a particular application the ability to select the most suitable tracking algorithm for their specific needs. Furthermore, we aim to identify new trends and ideas in the tracking community and hope to provide insight for the development of new tracking approaches. The basic framework of moving object detection for video surveillance is shown in Figure 1.




Figure 1: Framework for Basic Video Object Detection System 

There has been a growing research interest in video image segmentation over the past decade and towards this 
end, a wide variety of methodologies have been developed [53]-[56]. 



Impact Factor(JCC): 7.2165 



NAAS Rating: 3.63 







Video Segmentation for Moving Object Detection & Tracking- A Review 






Video segmentation methodologies have made extensive use of stochastic image models, particularly the Markov Random Field (MRF) model, as a model for video sequences [57]-[59]. The MRF model has proved to be an effective stochastic model for image segmentation [60]-[62] because of its ability to model context-dependent entities such as image pixels and correlated features. In video segmentation, in addition to spatial modeling and constraints, temporal constraints are also added to devise spatio-temporal image segmentation schemes.

An adaptive clustering algorithm has been reported [57] in which temporal constraints and temporal local density are adopted for a smooth transition of the segmentation from frame to frame. Spatio-temporal segmentation has also been applied to image sequences [63] with different filtering techniques. The extraction of a moving object and its tracking have been achieved in a spatio-temporal framework [64], with a genetic algorithm serving as the optimization tool for image segmentation.

Recently, the MRF model has been used to model the spatial entities in each frame [64], and a Distributed Genetic Algorithm (DGA) has been used to obtain the segmentation. An improved version of the DGA has been proposed [58] to obtain the segmentation of video sequences in a spatio-temporal framework. Moreover, video segmentation and foreground subtraction have been achieved using the spatio-temporal notion [65]-[66], where the spatial model is a Gibbs Markov Random Field and the temporal changes are modeled by a mixture of Gaussian distributions. More recently, an algorithm for the automatic segmentation of foreground objects in video sequences has been proposed [67].

Issues in Building an Object Tracker

In a tracking scenario, an object can be defined as anything that is of interest for further analysis. For example, fish inside an aquarium, boats on the sea, planes in the air, vehicles on a road, bubbles in the water, or people walking on a road are objects that may be important to track in a specific domain. Objects can be represented by their shapes and appearances.




Figure 2: Object Representations. (A) Centroid, (B) Multiple Points, (C) Rectangular Patch, (D) Elliptical Patch, (E) Part-Based Multiple Patches, (F) Object Skeleton, (G) Complete Object Contour, (H) Control Points on the Object Contour, (I) Object Silhouette

Selecting the right features plays a critical role in tracking. In general, the most desirable property of a visual feature is its uniqueness, so that objects can be easily distinguished in the feature space. Feature selection is closely related to the object representation. For example, color is used as a feature for histogram-based appearance representations, while for contour-based representations, object edges are usually used as features. In general, many tracking algorithms use a combination of these features.

Motion-based segmentation algorithms can be classified either by their motion representations or by their clustering criteria. Because the motion representation plays such a crucial role in motion segmentation, motion segmentation methods usually focus on the design of motion estimation algorithms. Consequently, motion segmentation approaches are best identified and distinguished by the motion representation they adopt. Within each subgroup identified by its motion representation, the approaches are distinguished by their clustering criteria.




Figure 3: (a) Basic Motion-Based Segmentation & (b) Simplified Spatio-Temporal Segmentation

Every tracking method requires an object detection mechanism, either in every frame or when the object first appears in the video. A common approach to object detection is to use information in a single frame. However, some object detection methods make use of temporal information computed from a sequence of frames to reduce the number of false detections. This temporal information is usually in the form of frame differencing, which highlights changing regions in consecutive frames. Given the object regions in the image, it is then the tracker's task to perform object correspondence from one frame to the next to generate the tracks.
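
Frame differencing, as described above, can be sketched in a few lines of plain Python. Frames are assumed here to be grayscale images stored as nested lists of integer intensities; the threshold value of 25 and the helper names are illustrative choices, not values taken from the reviewed literature.

```python
def frame_difference(prev_frame, curr_frame, threshold=25):
    """Return a binary mask: 1 where |curr - prev| exceeds the threshold."""
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]

def bounding_box(mask):
    """Smallest (top, left, bottom, right) box around changed pixels, or None."""
    coords = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return (min(rows), min(cols), max(rows), max(cols))

# A bright pixel moves one column to the right between two 4x4 frames.
prev_frame = [[0] * 4 for _ in range(4)]
curr_frame = [[0] * 4 for _ in range(4)]
prev_frame[1][1] = 200
curr_frame[1][2] = 200

mask = frame_difference(prev_frame, curr_frame)
print(mask[1])             # [0, 1, 1, 0]
print(bounding_box(mask))  # (1, 1, 1, 2)
```

Note that the mask marks both the old and the new position of the moving pixel, which is why differencing is normally followed by a correspondence step in the tracker.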




Figure 4: Classification of Tracking Methods for Moving Object Detection in Videos 

The main attributes of such tracking algorithms can be summarized as follows. 

• Feature-Based or Dense-Based: In feature-based approaches, objects are represented by a limited number of points, such as corners or salient points, whereas dense approaches compute a pixel-wise motion.

• Occlusions: the ability to deal with occlusions.

• Multiple Objects: the ability to deal with more than one object in the scene.

• Spatial Continuity: the ability to exploit spatial continuity.

• Temporary Stopping: the ability to deal with the temporary stopping of objects.

• Robustness: the ability to deal with noisy images (in the case of feature-based approaches, it is the position of the points that is affected by noise, but not the data association).

• Sequentiality: the ability to work incrementally; this means, for example, that the algorithm is able to exploit information that was not present at the beginning of the sequence.

• Missing Data: the ability to deal with missing data.

• Non-Rigid Object: the ability to deal with non-rigid objects.

• Camera Model: if one is required, which camera model is used (orthographic, para-perspective, or perspective).

Structure of the Review

The structure of this paper is as follows. The introductory section gives a brief introduction to video segmentation, object detection, and tracking, and to their necessity in the field of video surveillance. The introduction also briefly explains the principles and issues in developing an object tracker.

Section 2 gives a general review of mean-shift and optical flow based approaches in video segmentation.

Section 3 reviews recent research on the traditional active contour model and on layer and region based approaches, covering video segmentation and object tracking techniques based on these approaches.

Section 4 addresses stochastic and statistical approaches, including the Markov random field, MAP, PF, and the expectation maximization algorithm.

Section 5 explains deterministic and feature based approaches, including edge and patch based features and skeleton based approaches.

Section 6 covers image difference and subtraction based approaches, including temporal differencing and image subtraction methods.

Section 7 presents supervised and semi-supervised learning based approaches. A general conclusion of the paper and a discussion of the review are presented in Section 8, followed by a tabular comparison of the research reviewed in the previous sections.

2. MEAN-SHIFT AND OPTICAL FLOW BASED APPROACHES 

Mean-shift (MS) is a clustering technique used in image segmentation over a joint spatial-color space. The mean-shift segmentation technique is used to analyze a complex multi-modal feature space and to identify the feature clusters. It is a non-parametric method. The size and shape of the region of interest (ROI) are the only free parameters of the mean-shift process, that is, of the multivariate density kernel estimator. A two-step sequence of discontinuity-preserving filtering and mean-shift clustering is used for mean-shift image segmentation [68, 69]. The MS algorithm is initialized with a large number of hypothesized cluster centers randomly chosen from the data of the given image. The MS algorithm aims at finding the nearest stationary point of the underlying density function of the data and is therefore useful for detecting the modes of the density. For this purpose, each cluster center is moved to the mean of the data lying inside the multidimensional ellipsoid centered on the cluster center [70]. In the meantime, the algorithm builds up a vector defined by the old and the new cluster centers. This vector is called the MS vector and is computed iteratively until the cluster centers no longer change their positions. Some clusters may get merged during the MS iterations [68, 70, 71]. MS-based image segmentation (in the sense of the spatial segmentation of a video frame) is a direct extension of the discontinuity-preserving smoothing algorithm. After nearby modes are pruned, as in the generic feature space analysis technique [68], each pixel is associated with a significant mode of the joint domain density located in its neighborhood.
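
The mode-seeking iteration described above can be illustrated with a minimal Python sketch. It assumes a flat kernel with a spherical window of fixed `radius` (a simplification of the ellipsoid mentioned in the text) and a single starting center; the function name and parameters are illustrative, and the sketch omits the mode-merging and pruning steps.

```python
import math

def mean_shift_mode(points, start, radius, tol=1e-6, max_iter=100):
    """Move a cluster center to the mean of nearby points until it stops moving."""
    center = tuple(start)
    for _ in range(max_iter):
        # Data lying inside the window centered on the current cluster center.
        inside = [p for p in points if math.dist(p, center) <= radius]
        if not inside:
            break
        new_center = tuple(sum(coord) / len(inside) for coord in zip(*inside))
        shift = math.dist(new_center, center)  # length of the mean-shift vector
        center = new_center
        if shift < tol:
            break
    return center

points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
print(mean_shift_mode(points, start=(0, 0), radius=2.0))    # (0.5, 0.5)
print(mean_shift_mode(points, start=(11, 11), radius=2.0))  # (10.5, 10.5)
```

Starting the same routine from many random centers and merging the centers that converge to the same location gives the clustering behavior the section describes.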

The MS technique is prone to falling into local maxima in the case of clutter or occlusion [69]. CAMShift (CMS) is a tracking technique that is an improved form of the MS method. The MS algorithm operates on color probability distributions (CPDs), and CMS is a modification of MS that deals with dynamic changes of the CPDs. To track colored objects in video frame sequences, the color image data has to be represented as a probability distribution [72, 73]. To accomplish this, Bradski used color histograms in his study [72]. The Coupled CMS algorithm, as described in Bradski's study [72] (and as reviewed in Francois's study [73]), is demonstrated in a real-time head tracking application, which is part of the Intel OpenCV Computer Vision Library [74].
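
To make the idea of representing color data as a probability distribution concrete, the sketch below builds a hue histogram from a target region and back-projects it onto a frame, so that each pixel receives a score reflecting how typical its hue is of the target. This is only an illustration of the general histogram back-projection idea, not the implementation in [72] or in OpenCV; the hue range of 0 to 179, the bin count, and the function names are assumptions made for the example.

```python
def hue_histogram(hues, bins=18, max_hue=180):
    """Histogram of target hues, scaled so that the most frequent bin equals 1.0."""
    hist = [0] * bins
    for h in hues:
        hist[min(h * bins // max_hue, bins - 1)] += 1
    peak = max(hist)
    return [count / peak for count in hist] if peak else hist

def back_project(frame_hues, hist, max_hue=180):
    """Replace each pixel's hue by the histogram value of the bin it falls in."""
    bins = len(hist)
    return [
        [hist[min(h * bins // max_hue, bins - 1)] for h in row]
        for row in frame_hues
    ]

target_hues = [10, 11, 12, 100]        # hues sampled from the object to track
hist = hue_histogram(target_hues)
frame = [[10, 100], [50, 15]]          # a tiny 2x2 frame of hue values
for row in back_project(frame, hist):
    print(row)
```

A mean-shift or CAMShift style tracker would then climb this score image toward its local maximum in each new frame.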

Numerous papers address the problem of tracking moving objects in image sequences captured by stationary cameras. Building on earlier work on video segmentation using the joint space-time-range mean shift, the authors extend the scheme to allow the tracking of moving objects. Large displacements of the pdf modes in consecutive image frames are exploited for tracking [7-13].

Optical flow (or optic flow) is the pattern of apparent motion of objects, surfaces, and edges in a visual scene caused by the relative motion between an observer (an eye or a camera) and the scene. The work in [1] proposes a novel technique to accurately estimate the optical flow for fully automated tracking of moving objects within video streams. This technique consists of an occlusion-killer algorithm and template matching using segmented regions, which are obtained by a skip-labelling algorithm. Takaya [2] presents an application of the classical technique of active contours (snakes) to a real-time video environment with an optical flow based technique, where tracking the video object is the primary task. The greedy technique, which minimizes the energy function to update the snake points, was combined with the optical flow technique to cope with the situation in which the object of interest moves too far outside the reach of the edge sensor.

Many authors present research that combines optical flow with other efficient schemes, such as the gradient snake, state dynamics, the Kirsch operator, and simultaneous estimation approaches, as in [3], [4], [5], and [6]. Differential optical flow approaches are widely used within the computer vision community. They are classified as being either local, as in the Lucas-Kanade technique, or global, as in the Horn-Schunck method. As the physical dynamics of an object are inherently coupled to the behaviour of its image in the video stream, the authors of [6] use such dynamic parameter information in computing the optical flow when tracking a moving object in a video stream. Specifically, the authors use a modified error function in the minimization that includes the physical parameter information.
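
For reference, the local and global differential formulations mentioned above can be written compactly. The notation is an assumption introduced here: I_x, I_y, and I_t denote the spatial and temporal derivatives of the image intensity, (u, v) is the flow vector, W is a local window, and alpha is a smoothness weight. These are the standard textbook forms, not the modified error function of [6].

```latex
% Brightness constancy constraint shared by both families of methods
I_x u + I_y v + I_t = 0

% Lucas-Kanade (local): least-squares solution over a window W
\begin{bmatrix}
  \sum_{W} I_x^2   & \sum_{W} I_x I_y \\
  \sum_{W} I_x I_y & \sum_{W} I_y^2
\end{bmatrix}
\begin{bmatrix} u \\ v \end{bmatrix}
= -
\begin{bmatrix} \sum_{W} I_x I_t \\ \sum_{W} I_y I_t \end{bmatrix}

% Horn-Schunck (global): energy minimized over the whole image
E(u, v) = \iint \left[ (I_x u + I_y v + I_t)^2
        + \alpha^2 \left( \|\nabla u\|^2 + \|\nabla v\|^2 \right) \right] dx \, dy
```

The local form yields one flow vector per window and fails where the 2x2 matrix is ill-conditioned, while the global form propagates flow into textureless regions through the smoothness term.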

3. ACTIVE CONTOUR MODEL, LAYER AND REGION BASED APPROACHES

Another image segmentation approach is the active contour model (ACM), which falls within the scope of edge-based segmentation and is also used in object tracking. A snake, which is a kind of active contour model, is an energy-minimizing spline. The snake's energy depends on its shape and location within the image. The desired image properties usually correspond to the local minima of this energy. The ACM, which was proposed by Kass et al. [75] in 1987, is also known as the snake method.
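
The energy referred to above is commonly written in the following form, given here as a reference sketch. The symbols are assumptions introduced for this illustration: v(s) = (x(s), y(s)) is the contour parameterized by s in [0, 1], and alpha(s) and beta(s) weight the elasticity and stiffness terms.

```latex
E_{\text{snake}} = \int_{0}^{1} \left[ E_{\text{int}}(v(s)) + E_{\text{image}}(v(s))
                 + E_{\text{con}}(v(s)) \right] ds

E_{\text{int}}(v(s)) = \tfrac{1}{2} \left( \alpha(s) \left| v_s(s) \right|^2
                     + \beta(s) \left| v_{ss}(s) \right|^2 \right)
```

Here E_image attracts the contour to image features such as edges, and E_con encodes external constraints, for example user-supplied forces.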

A snake can be regarded as a set of control points (or snaxels) connected to each other, which can easily be deformed under an applied force. According to the study of Dagher and Tom [76], a snake works best when the control points are adequately spaced and the coordinates of the initial position are controlled.

In [14], a new system for semantic video object tracking using backward region-based classification is presented. It consists of five elementary stages: region pre-processing, region-based motion estimation, region extraction, region classification, and region post-processing. Promising experimental results have shown the solid performance of this generic tracking system with pixel-wise accuracy. Another method starts with a rough user-supplied video object (VO) definition. In [15], it then combines each frame's region segmentation and motion estimation results to build the objects of interest for temporal tracking of the object over time. An active contour model based algorithm is then employed to further fine-tune the object's contour so as to extract the accurate object boundary.

Automatic detection of regions of interest (ROIs) in complex images or video, such as an angiogram or endoscopic neurosurgery video, is a critical task in many medical image and video processing applications. The author of [16] applies a region based approach to this kind of problem. Some region based approaches are developed by combining them with techniques such as the Bayes classifier and wavelets, as in [17], [18], and [19]. Many studies have sought to enhance the traditional active contour models in order to improve results, as in [20], [21], [22], [23], and [24]. The work in [20] presents a novel moving object detection and tracking system that robustly fuses infrared and visible video within a level set framework. The authors also present the concept of the flux tensor as a generalization of the 3D structure tensor for fast and reliable motion detection without eigen-decomposition. A new combination of the Active Contour Model (ACM) with the Unscented Kalman Filter (UKF) to track deformable objects in a video sequence is presented in [23]. A novel active contour model, VSnakes, is presented as a segmentation method in the framework of [24]. Compared with the original snake algorithm, semiautomatic video object segmentation with the VSnakes algorithm resulted in improved performance in terms of video object shape distortion.

4. STOCHASTIC AND STATISTICAL APPROACHES

Statistical and stochastic theory is widely used in the field of motion segmentation. In fact, motion segmentation can be seen as a classification problem in which every pixel has to be classified as background or foreground. Statistical methods can be further divided depending on the framework used. The common frameworks are Maximum A Posteriori Probability (MAP), the Particle Filter (PF), and Expectation Maximization (EM). Statistical methods provide a general tool that can be used in very different ways depending on the specific method. MAP is often used in combination with other methods. For example, in [77] it is combined with the Probabilistic Data Association Filter. In [78], MAP is used together with level sets incorporating motion information. In [79], the MAP framework is used to combine and exploit the interdependence between motion estimation, segmentation, and super resolution. The author of [25] improves the compressed domain method. First, the motion vectors are accumulated over a few frames to enhance the motion information, and they are further spatially interpolated to obtain dense motion vectors. The final segmentation, using the dense motion vectors, is obtained by applying the Expectation Maximization (EM) algorithm. A block-based affine clustering technique is proposed for determining the number of appropriate motion models to be used in the EM step, and the segmented objects are temporally tracked to obtain the video objects. There are numerous applications of the EM algorithm and the particle filter, as in [25, 26, 27, 28, 29, 30, 31].

www.tjprc.org

editor@tjprc.org

Priya Gupta, Nivedita Kumari & Shikha Gupta

The stochastic approach is based on modeling an image as a realization of a random process. Typically, it is assumed that the image intensity derives from a Markov Random Field (MRF) and, consequently, satisfies the properties of locality and stationarity, i.e., every pixel is related only to a small set of neighboring pixels, and different regions of the image are perceived as similar. Markov random field (MRF) methods come under this category. A pioneering work on the recovery of plane image geometry is due to [80]. Their algorithm starts with the detection of the boundaries of the image objects. The subsequent step is the identification of the occluded and occluding objects. To this end, [80] had the bright idea of imitating the natural ability of human vision to complete partially occluded objects, the so-called amodal completion process described and studied by the Gestalt school of psychology, and mainly by [81]. The theory is applied to a specific model of MRF recognition presented in [82].
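
The way MAP estimation and the MRF prior fit together in segmentation can be summarized in a generic form. The notation is an assumption introduced for this sketch: x is the label field (for example, foreground or background per pixel), y is the observed image data, C is the set of cliques of the neighborhood system, V_c are clique potentials, and Z is a normalizing constant. It is not the specific model of any single cited paper.

```latex
% MAP estimate of the label field given the observations
\hat{x} = \arg\max_{x} P(x \mid y) = \arg\max_{x} P(y \mid x) \, P(x)

% MRF prior expressed as a Gibbs distribution over clique potentials
P(x) = \frac{1}{Z} \exp\!\left( - \sum_{c \in C} V_c(x) \right)

% Equivalent energy minimization
\hat{x} = \arg\min_{x} \left[ - \log P(y \mid x) + \sum_{c \in C} V_c(x) \right]
```

The locality property appears through the cliques: each potential depends only on a pixel and its small set of neighbors, so the prior rewards labelings that are smooth within regions.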

The authors of [32] propose a new video object tracking method based on kernel density estimation and the Markov Random Field (MRF). The video objects of interest are first segmented by the user, and a nonparametric model based on kernel density estimation is initialized for each video object and for the remaining background, respectively. In one popular method of this type, at the object detection stage the spatial features of the object are extracted by the wavelet transform and, according to the frame difference, the moving target is determined. In order to exploit the temporal motion information effectively, the Markov Random Field prior probability model and the observation field model are established; using the Bayesian criterion, the location of the object in the consecutive frame is estimated, which contributes to accurate space object detection [33]. Some authors use the Markov random field in a modified way, as in [34], [35], and [36].

5. DETERMINISTIC AND FEATURE BASED APPROACHES 

The main purpose of the deterministic approach is to recover the geometry of the image. The advantage of a skeleton is that it contains useful information about the shape, especially topological and structural information. To obtain the skeleton of a shape, the boundary or edge of the shape is first extracted using edge detection algorithms [95], and then its skeleton is generated by skeleton extraction methods [96]. The medial axis is a type of skeleton that is defined as the locus of the centers of the maximal disks that fit within the shape [97]. The authors of [98] present an algorithm for automatically estimating a subject's skeletal structure from optical motion capture data without using any a priori skeletal model. In [99], other researchers have worked on skeleton fitting techniques for use with optical motion capture data. The authors of [100] describe a partly automatic method for inferring skeletons from motion. They solve for the joint positions by finding the center of rotation in the inboard frame for the markers on the outboard segment of every joint. The technique of [101] works with distance constraints, although it still relies on rotation estimates. A few specific examples of approaches for inferring information about a human subject's skeletal structure from the motions of bone or skin mounted markers can be found in [102]. A survey of calibration by parameter estimation for robotic devices is available in [103]. A few edge and junction based studies have also been carried out in the field [106, 107]. Approaches that use the edge-based feature type extract the edge map of an image and identify the features of the object in terms of edges. Some examples include [83, 84, 88-101]. Using edges as features has advantages over other features for several reasons. As discussed in [88], they are largely invariant to illumination conditions and to variations in the objects' colors and textures. They also represent the object boundaries well and represent the data efficiently in the large spatial extent of the images.

The other prevalent feature type is the patch based feature type, which uses appearance as a cue. This feature type has been in use for more than two decades [108], and edge-based features are relatively new in comparison. Moravec [108] looked for local maxima of the minimum intensity gradients, which he called corners, and selected a patch around these corners. His work was improved by Harris [109], whose detector is less sensitive to noise, edges, and the anisotropic nature of the corners proposed in [108]. In this feature type, there are two main variations:

• Patches of rectangular shape that contain the characteristic boundaries describing the features of the objects [110-115]. Usually, these features are referred to as local features.

• Irregular patches in which each patch is homogeneous in terms of intensity or texture, and the change in these features is characterized by the boundary of the patches. These features are commonly called region-based features.
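
The corner measures behind the Moravec and Harris detectors discussed above are usually stated as follows. The notation is an assumption introduced for this sketch: I is the image intensity, w is a window weighting function, (u, v) is a small shift, I_x and I_y are image derivatives, and k is an empirical constant (values around 0.04 to 0.06 are commonly quoted).

```latex
% Moravec: intensity change for a shift (u, v); the corner score is the
% minimum of E over a small set of shifts
E(u, v) = \sum_{x, y} w(x, y) \left[ I(x + u, y + v) - I(x, y) \right]^2

% Harris: second-moment (structure) matrix and corner response
M = \sum_{x, y} w(x, y)
    \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix},
\qquad
R = \det(M) - k \, (\operatorname{trace} M)^2
```

A large positive R indicates a corner, a negative R an edge, and a small |R| a flat region, which is how the Harris measure reduces the sensitivity to edges noted in the text.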

6. IMAGE DIFFERENCE AND SUBTRACTION BASED APPROACHES 

Object detection can be achieved by building a representation of the scene, called the background model, and then finding deviations from the model for each incoming frame. Any significant change in an image region from the background model signifies a moving object. Usually, a connected component algorithm is applied to obtain the connected regions corresponding to the objects. This process is referred to as background subtraction [116]. At each new frame, foreground pixels are detected by subtracting the intensity values from the background and filtering the absolute value of the differences with a dynamic per-pixel threshold [117]. The threshold and the reference background are updated using the foreground pixel information. Eigenbackground subtraction [118], proposed by Oliver et al., presents an eigenspace model for moving object segmentation. In this method, the dimensionality of the space constructed from sample images is reduced with the help of Principal Component Analysis (PCA). It is proposed that the reduced space after PCA should represent only the static parts of the scene, leaving out the moving objects, when an image is projected onto this space [117].
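
A minimal sketch of background subtraction with a per-pixel dynamic threshold is given below. It is one plausible reading of the scheme described above, not the exact update rule of [117]: the background and threshold are updated only at pixels classified as background, and the learning rate `alpha`, the scale factor `k`, and the floor `min_threshold` are illustrative assumptions.

```python
def subtract_background(frame, background, threshold,
                        alpha=0.1, k=5.0, min_threshold=10.0):
    """Return a foreground mask and update background/threshold in place."""
    mask = []
    for r, row in enumerate(frame):
        mask_row = []
        for c, value in enumerate(row):
            diff = abs(value - background[r][c])
            foreground = diff > threshold[r][c]
            mask_row.append(1 if foreground else 0)
            if not foreground:
                # Blend the new observation into the reference background.
                background[r][c] = (1 - alpha) * background[r][c] + alpha * value
                # Let the threshold follow the locally observed variation.
                threshold[r][c] = max(
                    min_threshold,
                    (1 - alpha) * threshold[r][c] + alpha * k * diff,
                )
        mask.append(mask_row)
    return mask

background = [[100.0, 100.0], [100.0, 100.0]]
threshold = [[20.0, 20.0], [20.0, 20.0]]
frame = [[100, 180], [105, 100]]
print(subtract_background(frame, background, threshold))  # [[0, 1], [0, 0]]
```

In a full pipeline, the resulting mask would be passed to a connected component step to group foreground pixels into object regions.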

The temporal differencing technique uses the pixel-wise difference between two or three consecutive frames in video imagery to extract moving regions. It is a highly adaptive approach to dynamic scene changes; however, it fails to extract all relevant pixels of a foreground object, particularly when the object has uniform texture or moves slowly [119]. A technique for tracking multiple objects in video sequences based on background subtraction and SIFT feature matching, where the camera is fixed and the input video sequences are real-time or self-captured, is described in [37]. The object is detected automatically by background subtraction, and effective tracking is then achieved by detecting the motion and matching the SIFT features of the detected object. Other research in this area includes [38, 39, 40, 41, 42], for applications such as multiple human object tracking, video background estimation, a real-time visual tracking system for dense traffic intersections, and real-time tracking for surveillance and security systems.

7. SUPERVISED AND SEMI-SUPERVISED LEARNING BASED APPROACHES



Object detection can be performed by learning different object views automatically from a set of examples by 
means of a supervised learning mechanism. Learning of different object views waives the requirement of storing a 
complete set of templates. Given a set of learning examples, supervised learning methods generate a function that maps 
inputs to desired outputs. 

In the standard formulation of supervised and semi-supervised learning, the learner approximates the behavior of a function by generating an output in the form of either a continuous value, which is called regression, or a class label, which is called classification. In the context of object detection, the learning examples are composed of pairs of object features and an associated object class, where both of these quantities are manually defined. Such classifiers include Support Vector Machines, Neural Networks, and Adaptive Boosting.

For the detection and tracking of specific objects in a knowledge-based framework, the system in [43] uses a supervised learning technique: support vector machines. Both problems, detection and tracking, are solved by a common approach: objects are located in the video sequences by an SVM classifier.

A currently dominant trend in tracking, known as tracking-by-detection, uses on-line classifiers in order to redetect objects over succeeding frames. Semi-supervised learning allows priors to be incorporated and is more robust in the case of occlusions, while multiple-instance learning resolves the uncertainty about where to take the positive updates during tracking. The work in [44] presents an on-line semi-supervised learning algorithm that is able to combine both of these methods into a coherent framework. A few other learning based methods fall into this category, including neural networks and support vector machines [45, 46, 47, 48, 49, 50, 51, 52].

8. CONCLUSIONS & DISCUSSIONS 

In this article, we present an extensive survey of video segmentation and object tracking methods and also give a brief review of related topics. We divide the tracking and segmentation approaches into six groups based on the use of object representations, including methods establishing point correspondence, methods using primitive geometric models, and methods using contour evolution. Note that all of these classes require object detection at some point. For example, point trackers require detection in every frame, whereas geometric region or contour-based trackers require detection only when the object first appears in the scene. Recognizing the importance of object detection for tracking systems, we include a short discussion of popular object detection approaches. We provide detailed summaries of object trackers, including discussion of the object representations, motion models, and parameter estimation schemes employed by the tracking algorithms. Moreover, we describe the context of use, degree of applicability, evaluation criteria, and a qualitative comparison of the tracking algorithms. We believe that this article, a survey on object tracking with a rich bibliography, can give valuable insight into this important research topic and encourage new research.

One challenge in tracking is to develop algorithms for tracking objects in unconstrained videos, for example, videos obtained from broadcast news networks or home videos. These videos are noisy, unstructured, compressed, and typically contain edited clips acquired by moving cameras from multiple views. Another related video domain is that of formal and informal meetings. These videos usually contain multiple people in a small field of view. Thus, there is severe occlusion, and people are only partially visible. One interesting solution in this context is to employ audio in addition to video for object tracking. There are some methods being developed for estimating the point of location of an audio source, for example, a person's mouth, based on four or six microphones. This audio-based localization of the speaker provides additional information, which can then be used in conjunction with a video-based tracker to solve problems such as severe occlusion. Overall, we believe that additional sources of information, in particular prior and contextual information, should be exploited whenever possible to attune the tracker to the particular scenario in which it is used. A principled approach to integrating these disparate sources of information will result in a general tracker that can be employed with success in a variety of applications.